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ABSTRACT 

Context. Observations of intergalactic neutral hydrogen can provide a wealth of information about structure and galaxy formation, potentially 
tracing accretion and feedback processes on Mpc scales. Below a column density of Nm ~ 10^^ cm~^, the "edge" or typical observational limit 
for Hi emission from galaxies, simulations predict a cosmic web of extended emission and filamentary structures. Current observations of this 
regime are limited by telescope sensitivity, which will soon advance substantially. 

Aims. We study the distribution of neutral hydrogen and its 21cm emission properties in a cosmological hydrodynamic simulation, to gain 
more insights into the distribution of H i below A^hi ~ 10^^ cm~^. Such Lyman Limit systems are expected to trace out the cosmic web, and are 
relatively unexplored. 

Methods. Beginning with a 32 Mpc simulation, we extract the neutral hydrogen component by determining the neutral fraction, including 
a post-processed correction for self- shielding based on the thermal pressure. We take into account molecular hydrogen, assuming an average 
density ratio Qh2/^hi = 0.3 at z = 0. The statistical properties of the Hi emission are compared with observations, to assess the reliability of 
the simulation. We then make predictions for upcoming surveys. 

Results. The simulated Hi distribution robustly describes the full column density range between A^hi ~ 10^"^ and A^m ~ 10^^ cm~^ and agrees 
very well with available measurements from observations. Furthermore there is good correspondence in the statistics when looking at the two- 
point correlation function and the H i mass function. The reconstructed maps are used to simulate observations of existing and future telescopes 
by adding noise and taking account of the sensitivity of the telescopes. 

Conclusions. The general agreement in statistical properties of H i suggests that neutral hydrogen as modeled in this hydrodynamic simulation 
is a fair representation of that in the Universe. Our method can be applied to other simulations, to compare diff'erent models of accretion and 
feedback. Future H i observations will be able to probe the regions where galaxies connect to the cosmic web. 

Key words. Structure formation - Simulations - Inter Galactic Medium - Cosmology 



1. Introduction 

Current cosmological models ascribe onl y about 4% o f the 
density in the Universe to baryons (S pergel et al. I l2007h . The 
majority of these baryons reside outside of galaxies; stars 
and cold galact ic gas may account for about about one third 
( Fukugita et al ][l998). Intergalactic baryons have historically 
been traced in absorption, such as the Lya forest arising from 
diflTuse photoionized gas that may account for up to 30% 
of baryons. The remaining baryons are predicted to exist in 
a wa r m-hot intergalactic medium (WHIM) (e.g. IPave et alJ 
19991 1200 ll: ICen & Qstrike3 Il999h . which is shock-heated 
during the collapse of density perturbations that give rise 
to the cosmic web. Still, absorption probes yield only one- 
dimensional redshift- space information, or in rare cases sev- 
eral probes through common structures. Mapping intergalac- 
tic baryons in emission can in principle provide morphological 



and kinematic information on accreting (and perhaps outflow- 
ing) gas within the cosmic web. 

Unfortunately, emission from intergalactic baryons is diffi- 
cult to observe, because current telescope sensitivities result in 
a detection limit of column densities A^hi ~ 10^^ cm"^, which 
are the realm of Damped Lya (DLA) systems and sub-DLAs. 
Below column densities of ~ Nui ~ 10^^-^ cm"^, the neu- 
tral fraction of hydrogen decreases rapidly due to the transi- 
tion from optically-thick to optically-thin gas ionized by the 
metagalactic ultraviolet flux. At lower densities the gas is no 
longer afl'ected by self shielding and the atoms are mostly ion- 
ized. This sharp decline in neutral fraction fror n almost unity 
to les s than a percent happens within a few kpc (jPove & Shulll 



19941) . Below Nm ~ 10^^^ cm ^ the gas is optically thin and 
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the decline in neutral fraction with total column is much more 
gradual. A consequence of this rapid decline in neutral frac- 
tion is a plateau in the H i column density distribution function 
between A^hi ~ 10^^-^ and Nui ~ 10^^-^ cm"^, where the rela- 
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tive surface area at these columns shows only modest growth. 
This beh aviour is confirmed in QSO absorption studies tabu- 
lated by ICorbelli & Ban dieral ( 2002 ) and in Hi emission by 
Braun & Thilker (2004). Below A^hi ~ 10^^-^ cm'^ the rela- 
tive surface area increases rapidly, reaching about a factor of 
30 larger at A^hi ~ 10^^ compared with A^hi ~ 10^^ cm"^. 

This plateau in the distribution function is a critical is- 
sue for observers of neutral hydrogen in emission. Although 
telescope sensitivities have increased substantially over the 
past decades, the detected surface area of galaxies observed in 
the 2 1cm line has only increased modestly (eg. Dov e & Shulll 
I994J). Clearly there is a flattening in the distribution func- 



tion near A^hi ~ 10 cm" which has limited the ability of 
even deep observations to detect hydrogen emission from a 
larger area. By establishing that a steeper distribution function 
is again expected below about A^hi ~ 10^^-^, it provides a clear 
technical target for what the next generation of radio telescopes 
needs to achieve to efl'ectively probe difl'use gas. 

Exploration of the A^hi < 10^^'^ cm"^ regime is essential 
for gaining a deeper understanding of the repository of baryons 
that drive galaxy formation and evolution. This gas, residing in 
filamentary structures, is the reservoir that fuels future star for- 
mation, and could provide a direct signature of smooth cold- 
mode accretion predicte d to dominat e gas acqu isition i n star- 
forming galaxies today dKeres etal.l 12005: Dek el et al.l E008: 
Keres et al. II2OO8). Furthermore, the trace neutral fraction in 
this phase may provide a long-lived fossil record of tidal in- 
teractions and feedback processes such as galactic winds and 
AGN-driven cavities. 

Several new large facilities to study 21cm emission are un- 
der development today. In view of the observational difficul- 
ties in probing the low H i column regime, it is particularly im- 
portant to have reliable numerical simulations to aid in plan- 
ning new observational campaigns, and eventually to help in- 
terpret such observations within a structure formation context. 
While simulations of galaxy formation are challenging, histor- 
ically they have had much success predic ting the more diff ^use 
baryons residing in the cosmic web (e.g. IPave et al] ^1999). If 
such simulations display statistical agreement with key existing 
H I emission data, then they can be used to make plausible pre- 
dictions for the types of structures that may be detected, along 
with suggesting optimum observing strategies. 

In this paper, we employ a state-of-the art cosmological 
hydrodynamic simulation to study H i emission from filamen- 
tary large-scale structure and the galaxies within them. The 
simulation used here include a well-constrained prescription 
for galactic outflows that has been shown to reproduce the 
observ ed metal and H i absorption line proper ties from z ~ 
6^0 (iQppenheimer & Davel2006ll2008[l2009k . We develop a 
method to produce H i maps from these simulations, and com- 
pare statistical properties of the reconstructed Hi data with 
the statistics of real Hi observations, to assess the reliability 
of the simulation. For this purpose we will priniarily u se the 
H I Parkes All Sky Survey (HIPASS^ lBarnes et al.1 (l200lb since 
this is the largest available H i survey. This work is intended 
to provide an initial step towards a more thorough exploration 
of model constraints that will be enabled by comparisons with 
present and future H i data. 



Note that current spatial resolution of simulations hav- 
ing cosmologically-representative volumes cannot reproduce 
a galaxy as would be seen with H i observations having sub- 
kpc resolution. Therefore we do not consider the internal kine- 
matics or detailed shapes of the objects associated with simu- 
lated galaxies. We can only assess the statistical properties of 
the difl'use H i phase and predict how the gas is distributed on 
multi-kpc scales. We particularly focus on lower column den- 
sity material that may be probed with future H i surveys, which 
primarily reside in cosmic filaments within which galaxies are 
embedded. 

Our paper is organized as follows. In section two we briefly 
describe the particular simulation that has been analyzed. In 
section three we describe our method to extract the neutral hy- 
drogen from the simulations. The neutral fraction is determined 
from both a general ionization balance as well as a local self- 
shielding correction. We also model the transition from atomic 
to molecular hydrogen We present our results, showing the sta- 
tistical properties of the recovered H i in section four, where 
they are compared with similar statistics obtained from obser- 
vations. The distribution of neutral hydrogen is compared with 
the distribution of dark matter and stars in this section as well. 
In the fifth section we discuss the results and outline the impli- 
cations. Finally, section six reiterates our main conclusions. 



2. Simulation Code 

A modified version of the N-body-i-hydrodynamic code 
Gadget-2 is employed, which uses a tree-particle-mesh algo- 
rithm to compute gavitational forces on a set of particles, 
and an entropy-conserving formulati on of S n aoothe d Particle 
Hydrodynamics (SPH: S pringel & HernquistI (l2002h ) to simu- 
late pressure forces and shocks in the baryonic gaseous parti- 
cles. This Lagrangian code is fully adaptive in space and time, 
allowing simulations with a large dynamic range necessary 
to study both high-density regions harboring galaxies and the 
lower-density IGM. It includes a pr escription for star forma- 
tion following Springel & HernquiI3 ( 2003b) and galactic out- 
flows as described below. The code has been described in de- 
tail in Qppenh eimer & Dave 12006) and Oppen heimer & Davd 
(.2008 ): we will only summarise the properties here. 

The novel feature of our simulation is that it includes a 
well-constra ined model for galactic outflow s. The implementa- 
tion follows ISpringel & Hernquist (l2003al) . but employs scal- 
ings of outflow speed and mass loading factor with galaxy 



mass as expected for momentum-driven winds (|Murray et al 



2005|) . Our simulations using these scalings have been shown 
to successfully reproduce a wide range of IGM and galaxy 
data, including IGM enrichment as traced by z ~ 2-6 
C IV absorbers (Oppenheimer & Dave 2006), the galaxy mass- 
metallicity relation ( Finlator & Dave 2008|, the early g alaxy 
luminosity function and its e volution (Dave et al .1120061) . Ovi 
absorption at low-z (^O ppenheimer & Dave..2009ir and enrich- 
ment and entropy levels in galaxy groups Dave et al.l (12008k . 
Such outflows are expected to impact the distribution of gas 
in the larg e-scale structure around galaxies o ut to typically 
~ lOOkpc dOppenheimer & Davd [20081 l2009k . so are impor- 
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tant for studying the regions expected to yield detectable H i 
emission. 

The simulation used here is run with cosmologic al param 



eters consistent with the 3-year WMAP results ( Sp ergel et al 



Only photons more energetic than hv > 13.6 eV can ionize 
hydrogen. The io nization equilibrium equation is given by e.g. 
lOsterbrockl ([19891) as: 



2QQ7I) . The parameters are Qq = 0.25, Da = 0.75, = 0.044, rm f ^^a(y)dy = neUpfiiT), 
Ho = 71 km s"^ Mpc-\ o-g = 0.83, and n = 0.95. The peri- Jyo 



(1) 



odic cubic volume has a box length of 32 /z"^ Mpc (comoving), 
and the gravitational softening length is set to 2.5 kpc (co- 
moving). Dark matter and gas are represented using 256^ par- 
ticles each, yielding a mass per dark matter and gas particle of 
1.57 X lO^Mo and 3.35 x 10^ M©, respectively. The simulation 
was started in the linear regime at z = 129, with initial condi- 
tions established using a random realization of th e power spec- 
trum computed following lEisenstein & Hul (119991) . and evolved 
to z = 0. 



3. Method for Making Hi Maps 

We now describe the algorithm used to extract the neutral hy- 
drogen component from this simulation. The method developed 
is general, and can be applied to any simulation that has a sim- 
ilar set of output parameters. In our analysis, we set the total 
hydrogen number density to nu = O.lApg/mu where mu is the 
mass of the hydrogen atom. The factor 0.74 assumes a helium 
abundance of F = 0.26 by mass, and that all the helium is in 
the form of He i and He ii with a similar neutral fraction as hy- 
drogen. Apart from this factor the presence of helium is not 
taken into account in our calculations. We will describe how 
we determine the neutral fraction of the gas particles, includ- 
ing applying a correction for self shielding in high density re- 
gions and taking into account molecular hydrogen formation 
where relevant. This allows reconstruction of the neutral hy- 
drogen distribution, by mapping the particles onto a three di- 
mensional grid. 

3.1. Neutral Fraction of Hydrogen 

We begin by calculating the neutral fraction from the density 
and temperature of gas in the simulations, together with the H i 
photo-ionization rate provided by the cosmic UV background. 
Dove&Shul3(ll994l) found that the radial structure of the col- 
umn density of H i is more sensitive to the extragalactic radi- 
ation field than to the distribution of mass in the host galaxy. 
When calculating the neutral fraction, we assume that all pho- 
toionization is due to radiation external to the disk and that in- 
ternal stellar sources are not significant. In this case the nebular 
model as described in lOsterbrockId 1989k is a very good approx- 
imation, since the typical number density in the outer parts of 
galaxies, approximately 10"^ cm"^, is so low that collisional 
ionization is negligible. When going further out, the densities 
become even lower. Inside galaxies the volume densities are so 
high, that the neutral fraction is of order unity owing to self- 
shielding. 

In the IGM, hydrogen becomes ionized when the extreme 
ultraviolet (UV) radiation ionizes and heats the surrounding 
gas. On the other hand, the recombination of electrons leads 
to neutralization. The degree of ionization is determined by the 
balance between photo-ionization and radiative recombination. 



where Jy is the mean intensity of ionizing photons, a(y) is the 
ionization cross section and I3(T) is the recombination rate co- 
eflftcient to all levels of atomic hydrogen at temperature T. vq 
is the ionization threshold frequency. Only radiation with fre- 
quency y > vo is eff'ective in photoionization of hydrogen from 
the ground level. Summarising, the integral represents the num- 
ber of photoionizations per unit time and the right hand side of 
the equation gives the number of recombinations per unit time. 
The neutral hydrogen, nu, electron, rie, and proton, rip, densi- 
ties are related through the charge conservation and hydrogen 
abundance equations. 



ne = np = (1 -^)n 

where ^ is the neutral fraction, so 

nu = ^n 



(2) 



(3) 



with n the total density. 

We can write the ionization balance for neutral Hydrogen 

as 



^nTui = (l-0^nW) 



(4) 



where Fhi is the ionization rate for neutral hydrogen. 

With this equation it is easy to determine the neutral fraction, 

which is given by 



2C -h 1 - V(2C-hl)2-4C2 



2C 



usmg 



C 



Tui 



(5) 



(6) 



Obviously the neutral fractions that we calculate are closely 
related to the values we use for the photoionization and re- 
combination rate. The photoionization rate at low redshift is 
not well constrained observationally ; by co mbining Lya forest 
data and simulations, IPave & TrippI (1200 ik obtain a photoion- 
ization rate of Phi ~ lO-^^-^^^-^ s'^ for redshift z ~ 0.17. We 
will use the photoioniza tion rate given by the CUBA model of 



Haardt&Madaul (12001k . which is Phi ~ 10"^^ s'^ for z ~ 0. 



The recombination rate coefficients are dependent on tem- 
perature. We make use of an analytic function described by 
Verner & Ferlandl ( 1996k . that fits the coefficients in the tem- 
perature range form 3 K to 10^^ K: 



P(T) = a 



-1 



(7) 



where a, b, Tq and Ti are the fitting parameters. For the H i 
ion the fitting parameters are: a = 7.982 x 10"^^ cm^ s"\ 
b = 0.7480, To = 3.148 K and Ti = 7.036 x 10^ K. 
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Fig. 1. Neutral fraction as a function of density for different 
temperatures between 10^ and 10^ K. 



The neutral fraction is plotted for different temperatures as 
function of density of H atoms in Fig.[T] For temperatures be- 
low 10"^ K, the neutral fraction is still significant (a few percent) 
at reasonably low densities of 0.01 cm"^ but at higher temper- 
atures most of the gas is ionized, and the neutral fraction drops 
very quickly. 

3.2. Molecular Hydrogen 

When gas has cooled sufficiently, it coexists in the molecular 
(H2) and atomic (H i) phases. The H2 regions are found in dense 
molecular clouds where star formation occurs. Unfortunately 
there is large uncertainty in the average amount of H2 in galax- 
ies, as estimates have to rely on indirect tracers and conversion 
factors, for which the dependancies are not well-understood. 
As a result, there is substantial variance in estimates of the 



average density ratio at z = 0, tJu 



/Q^. (e.g. 

0.42 and 0.26 stated respective ly by ^Keres et al.l (|2003|) and 



Qbreschkow & Rawlingsl ( 2009|) ). It is beyond the scope of this 
paper to revisit these determinations, therefore we will adopt a 
value of Q.H2l^Hi = 0.3 that falls within the error bars of cur- 
rent estimates. Given the observed local value o f the atomic 
mass density, of pm = 6.1 x 10^ h M© Mpc"^ (IZwaan et alJ 
20031), this implies a molecular mass density of = 1.8x10^ 
/iMq Mpc"^. 

To define the regions of molecular hydrogen, we use 
a threshold based o n the thermal pressu r e (P/ k = nT). 
Wong & Blitl (l2002h. ISlitz & Rosolowskvl (l2004l) and more 
recently Blitz & Rosolowskvl (12006 ^ have made the case that 
the amount of molecular hydrogen that is formed in galaxies is 
determined by only one parameter, the interstellar gas pressure. 
In hydrostatic pressure equilibrium, the hydrostatic pressure is 
balanced by the sum of all contributions to the gas pressure: 
magnetic pressure, cosmic ray pressure and kinetic pressure 
terms (of which the thermal pressure is relatively small) (e.g. 
Walterbos & Braunl fl996) and references therein). However, 
thermal pressure is directly coupled to energy dissipation via 



radiation, and therefore thermal pressure can track the total 
pressure due to various equipartition mechanisms. An evalu- 
ation of the various contribution s to the total hydrostatic pres- 
sure is given by Bou lares & Coxl ( 1990|) . 

Two lines of constant thermal pressure are shown in Fig. [2] 
where temperatures are plotted against density for individual 
particles in the simulation. When following these lines, they 
cross two regions, one with high densities and low temperatures 
and one with moderate densities, but very high temperatures. 
These two regions are distinguished by the solid green line in 
Fig. [21 where the radiative recombination time is equivalent to 
the sound-crossing time on a kiloparsec scale. The radiative 
recombination time is given in .Tielens. (.2005^) by: 



Tree - 2 X 10^ years 

He 



(8) 



where He is the electron density which is comparable to the to- 
tal density for low neutral fractions. The sound crossing time 
is given by ts = R/Cs, where R is the relevant scale (assumed 
here to be one kpc) and Cs - 1.4 x lO^r^^^ cm s"^ is the sound 
velocity. All particles right of the green line have a recombina- 
tion time that is shorter than the sound crossing time. Particles 
with a recombination time that is larger than the sound crossing 
time are unlikely to be neutral or molecular. 

For each particle the thermal pressure can be calculated 
and particles with a pressure exceeding the threshold value 
and satisfying Tree < are considered molecular. By explor- 
ing different pressure values as shown in Fig. [3l the thresh- 
old can be tuned to yield the required molecular mass density, 
PH2 = 1.8 X 10'^ /z Mo Mpc"^. The threshold thermal pres- 
sure value we empirically determine is P/k = 810 cm"^ K. 
We must stress, that this value is very likely not a real physi- 
cal value, as the resolution in our simulation is not sufficient to 
resolve the scales of molecular clouds. Molecular clouds have 
smaller scales with higher densities, which will likely have sig- 
nificantly enhanced pressures. 

3.3. Correction for Self-Shielding 

Although the ionization state and kinetic temperature are deter- 
mined self-consistently within the simulation, it has been nec- 
essary to assume that each gas particle is subjected to the same 
all-pervasive radiation field. At both extremely low and high 
particle densities this approximation is sufficient, since local 
conditions will dominate. However, at intermediate densities, 
the "self- shielding" of particles by their neighbours may play a 
critical role in permitting local recombination, when the same 
particle would be substantially ionized in isolation. Present cos- 
mological simulations are not capable of solving the full radia- 
tive transfer equations, although it is now becoming possible 
to post-process rad iative transfer on individual galaxies (e.g. 
Pontzen et al.ll2008l) . Because we want to study emission from 
the IGM as well as galaxies, we must instead adopt a simple 
correction based on density and temperature to approximate 
self- shielding. We adopt a similar approach to the one that was 
used to model the atomic to molecular transition above, using 
the thermal pressure as a proxy for the hydrostatic pressure. 
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Fig. 2. Temperatures are plotted against densities for every 
200^^' particle in the simulation. The dashed (blue) and dash- 
dotted (red) lines correspond to constant thermal pressures of 
Pjk =155 and 810 cm"^ K, that were found empirically to re- 
produce the observed mass densities of atomic and molecular 
gas at z = 0. The solid (green) line shows where the recombina- 
tion time is equal to the sound-crossing time at a physical scale 
of one kpc. Particles above/left of the green line are unlikely to 
be neutral or molecular. 




3 1 ^ u ^ 1 

50 100 150 200 250 300 

P/k [cm"^ K] 

Fig. 4. The average H i mass density at z = is plotted against 
the threshold thermal pressure, where atomic hydrogen is as- 
sumed to recombine from ionized, while also satisfying the 
condition that Tree < ts (and accounting for a molecular den- 
sity ofp//2 = l.SxlO^/zMoMpc"^). AtapressureofP/^ = 155 
cm"^ K (dashed line), pm = 6.1 x 10^ /z M© Mpc"^ (dotted 
line), which is consistent with the H i density from lZwaan et al.l 
(I2OO3I) . 




M© Mpc"^ as discussed above. We subsequently calculate the 
atomic density as function of the thermal pressure threshold, as 
shown in Fig.lH It is empirically found, that a threshold value 
of Pjk =155 cm"^ K results in an H i density of p = 6.1 x 10^ 
h M© Mpc"^, that is similar to the derived value in'Zwaa n et al 



(^003). We will adopt this threshold value for our further anal- 
ysis. 

The typical densities and temperatures where self- shielding 
becomes important are not accurately defined. When looking 
at Fig. 121 the typical temperatures and densities which satisfy 
our empirical thermal pressure criterion for local recombina- 
tion are temperatures of ~ 10^ K and densities of ~ 0.01 
cm"^ . These values agree we l l with various estimates fro m lit- 
erature (e.g. ' Weinberg et aP (ll997hlWolfire et al.1 (l2003k . and 
■Pelupessv. (.20051) .) 



Fig. 3. The average molecular density at z = is plotted against 
the threshold thermal pressure, where molecular hydrogen is 
assumed to form from atomic, while also satisfying the con- 
dition that Tree < The dashed line indicates a pressure of 
P/y^ = 810 cm"^ K, where p//^ = 1.8 x 10"^ /z M© Mpc"^ which 
is shown by the dotted line. 



3.4. Gridding Method 

To reconstruct the density fields, we have employed a grid- 
based method, in which the value of the density field is cal- 
culated at a set of locations defined on a regular grid. The mass 
of each particle is spread over this grid in accordance with a 
particular weighting function W, to yield 



Only gas at a sufficiently high thermal pressure and for which 
the recombination time is shorter than the sound crossing time 
on kpc scales is assumed to recombine. Particles that satisfy 
the pressure and timescale condition are considered to be fully 
self- shielded, and their neutral fraction is set to unity. 

We will assume that the highest pressure regions which 
satisfy Tree < '^s have already provided = 1.8 x 10^ /z 



(9) 



where n = (rix, Uy, denotes the grid-cell, M is the number 



of cells of the grid in each dimension, 
particles and mt is the mass of particle /. 



N is the number of 
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We adopt the weighting function directly from what is 
used for SPH in Gadget-2, na mely a spline kernel defined by 
Monaghan & Lattanziol d 1985k : 



1-6 



(10) 



0, 



> 1 



where r is the distance from the position of a particle and h 
is the smoothing length for each particle (which in Gadget-2 
is the radius that encloses 32 gas particle masses). Furthermore 
we set a limit to the size of the smoothing length: the smoothing 
length of a particle has to be at least 1.5 times the resolution of 
the grid-cells, which means that a particle is distributed over at 
least three grid-cells in each dimension. This adversely aff'ects 
the highest density regions in the reconstructed field (when in- 
suflftcient gridding resolution is employed), but gives a more re- 
alistic representation of resolved objects and transitions with- 
out shot noise or step functions. We note that our procedure 
explicitly conserves total mass. 

4. Results 

Reconstructed density fields are gridded in three dimensions, 
for the total hydrogen component (ionized plus neutral) the 
neutral component and the molecular component. This makes it 
possible to compare the distribution of the total and neutral hy- 
drogen budget and permits determination of neutral fractions 
for the volume and column densities. Initially the full 32 /z"^ 
Mpc cubes is gridded with a cell size of 80 kpc. This allows 
visualization of the distribution on large scales and determina- 
tion of the average density of neutral hydrogen in the simula- 
tion volume Phi. The degree of clustering can be determined 
by looking at the two-point correlation function. However, this 
low resolution grid is not suitable to resolve the high density re- 
gions and small structures, as we will describe later. High den- 
sity regions of the simulation volume were selected and grid- 
ded with a cell- size of 2 kpc. The H i column density distribu- 
tion function and the H i mass function can be determined from 
these regions. The properties of the simulated H i gas will be 
described and the statistics will be compared with the statisti- 
cal properties of observatio nal data, mos t ly from the H i Parkes 
All Sky Survey (HIPASS) dBarnes et al.1 (l200lk ). 

Apart from gas or SPH particles, the simulations contain 
star and dark matter particles as well. We will adopt a rela- 
tively simple gridding scheme to reconstruct the distribution of 
stars and dark matter. This can be very useful to verify whether 
the stars, but especially the gas (or reconstructed H i) trace the 
distribution of Dark Matter. 



4.1. Mean Hi Density 

The average H i density is an important property, as this single 
number gives the amount of neutral hydrogen that is recon- 
structed without any further analysis. The H i density is very 



well determined fr om the 1000 brightest HIPASS galaxies in 
Zwaan et al. I (l2003k They deduce an H i density due to galaxies 
in the local universe of pm = (6.9 ± l.l) - 10^ h M© Mpc"^ or 
Phi = (6.1 ± 1.0) • 10^ h M© Mpc"^ when taking into account 
biases like selection bias, E ddington eff'ect, H i self a b sorpti on 
and cosmic variance. From [Rosenberg & Schneiden (|2002|) a 
value of Phi = 7.1 • 10^ h M© Mpc"-^ can be derived for the 
average H i density in the universe. We will adopt the value of 
Pjjj = (6.1 ± 1.0) • 10^ h M© Mpc"^. The pressure thresholds 
for molecular and atomic hydrogen are tuned to reproduce this 
density as is described earlier in this paper. 

4.2. H I Distribution Function 

As mentioned above, the low and intermediate column densi- 
ties (Nui < 10^^ cm"^) do not have a very significant contribu- 
tion to the total mass budget of H i. 

For comparison with our simulation, the Hi distribution 
function derived from QSO absorption line data will be used as 
tabulated in lCorbeUi & Bandieral (l2002h . For the QSO data the 
column density distribution function /(A^hi) is defined such that 
f(Nui)dNuidX is the number of absorbers with column density 
between Nui and Nui + dNui over an absorption distance in- 
terval dX. We derive f(Nui) from the statistics of our recon- 
structed H I emission. The column density distribution function 
in a reconstructed cube can be calculated from. 



f(Nm) = 



A(Nni) 2 
cm , 



Hodz dNm 



(11) 



where dX = dzHo/c and A(A^hi) is the surface area subtended 
by H I in the column density interval dNm centered on A^hi- 

As the simulations contain Hi column densities over the 
full range between A^hi = 10^^ and 10^^ cm"^, we can plot the 
H I column density distribution function /(Nui) over this en- 
tire range with excellent statistics, in contrast to what has been 
achieved observationally. In the left panel of Fig. [5] we overlay 
the H I distribution functions we derive from the simulations 
with the da ta values obtained from QS O absorption lines as 
tabulated bv lCorbeUi & Bandieral (l2002 k (black dots). The hor- 
izontal lines on the QSO data points correspond to the bin- size 
over which each data point has been derived. Vertical error bars 
are not shown, as these have the same size as the dot. Around 
A^Hi = 10^^ cm"^ there is only one data bin covering two or- 
ders of magnitude in column density, illustrating the difficulty 
of sampling this region with observations. This corresponds to 
the transition between optically thick and thin gas, where only 
a small increase in surface covering is associated with a large 
decrease in the column density. 

The dashed (red) line corresponds to data gridded to a 
80 kpc cell size. At low column densities the simulated dis- 
tribution function agrees very well with the QSO absorption 
line data. The transition from optically thick to optically thin 
gas happens within just a few kpc of radius in a galaxy disk 
dPove & Shulll 1 1994) . Clearly a reconstructed cube with a 
80 kpc cell size does not have enough resolution to resolve 
such transitions. Some form of plateau can be recognised in 
the coarsely gridded data above A^hi = 10^^ cm"^, however it is 
not a smooth transition. Furthermore because of the large cell 
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size, no high column density regions can be reconstructed at 
all. The cores of galaxies have high column densities, but these 
are severely diluted within the 80 kpc voxels. 

To circumvent these limitations, structures with an H i mass 
exceeding 5x10^ M© in an 80 kpc voxel have been identified 
for individual high resolution gridding. This mass limit is cho- 
sen to match the mass-resolution of the simulation. The mass 
of a typical gas particle is ~ 2.5 • 10^ M©, when taking into 
account the abundance of hydrogen with respect to helium, we 
need at least 20 gas particles to form a 5 x 10^ M© structure. 
As the neutral fraction is much less than one for most of the 
particles, the number of particles in one object is much larger. 
We find 719 structures above the mass limit and grid a 300 kpc 
box around each object with a cell size of 2 kpc. 

We emphasize that gridding to a higher resolution does not 
mean that the physics is computed at a higher resolution. We 
are still limited by the simplified physics and finite mass res- 
olution of the particles. A method of accounting for structure 
or clun iping below the resolu tion of the simulation is described 
in e.g. iMellema et al.l (|2006|) . To derive the clumping factor, 
they have used another simulation, with the same number of 
particles, but a much smaller computational volume, and thus 
higher resolution. In our analysis, we accept that we cannot re- 
solve the smallest structures, since we are primarily interested 
in the diff'use outer portions of galactic disks. We have chosen a 
2 kpc voxel size, as this number represents the nominal spatial 
resolution of the simulation. The simulation has a gravitational 
softening length of 2.5 kpc but note that the smoothing 
lengths can go as low as 10% of the gravitational softening 
length. 

Distribution functions are plotted for simulated Hi using 
the two diff'erent voxel sizes of 80 and 2 kpc in the left panel of 
Fig. [5]. When using a 80 kpc voxel size, the reconstructed maps 
are unable to resolve structures with high densities, causing er- 
ratic behaviour at column densities above A^hi ~ 10^^ cm"^. 
When using the smaller voxel size of 2 kpc, there is an excellent 
fit to the observed data between about Nui = 10^^ and 10^^-^ 
cm"^. The lower column densities are not reproduced within 
the sub-cubes (although they are in the coarsely-gridded full 
simulation cube), while the finite mass and spatial resolution 
of the simulation do not allow a meaningful distribution func- 
tion to be determined above about Nui = 10^^ cm"^. 

Below Ahi = 10^^ cm"^ a transition can be seen with 
the distribution function becoming flatter. The efl'ect of self- 
shielding is decreasing, which limits the amount of neutral hy- 
drogen at these column densities. Around Nui = 10^^ cm"^ 
the optical de pth to photons at the hydrogen ionisation edge 



tne optical ae ptn to pnotons at tne nyarogen lonisation edge 
is equal to 1 (|Zheng & Miralda-Escudel l2002l). Self-shielding 
no longer has any effect below this column density and a sec- 
ond transition can be seen. Now the neutral fraction is only 
determined by the balance between photo-ionisation and ra- 
diative recombination. The distribution function is increasing 
again as a power law toward the very low column densities of 
the Lyman-alpha forest. The slope in this regime agrees very 
well with the QSO data. Note that the 2 kpc gridded data are 
slightly ofl'set to lower occurrences compared to the 80 kpc 
gridded data. This is because we only considered the vicinity 
of the largest mass concentrations in the simulation for high 



resolution sampling. For the same reason the function is not 
representative below A^hi ~ 3 x 10^^ cm"^, while for the full, 
80 kpc gridded cube it can be traced to A^hi ~ 5 x 10^^ cm"^. 
Of course, lower column density systems can be produced in 
these simulations when artificial spectra are co nstructed (e.g. 
Dave & Trippll200ll: lOppenheimer & DavelEoO Q). but our fo- 
cus here is on the high column density systems that are well- 
described by our gridding approach. 

The distribution functions after gridding to 2 kpc (solid 
line), and the low column density end of the 80 kpc grid- 
ding (dotted line) are plotted again in the right panel of 
Fig. m but now with several observed distributions overlaid. 
The high column density regime is covered by the WHISP data 
(ISwaters et al.ll2002l: iNoordermeer et al.ir2005h in H i emission ; 
a Schechter function fit to this data by IZwaan et al.l (2005) is 
shown by the dashed line. The dash-dotted line shows H i emis- 
sion data from the extended M31 environment after combin- 



ing d ata from a range of diff'erent telescopes ( B raun & Thilker 
20041). Since this curve is based on only a single, highly in- 
clined system, it may not be as representative as the curves 
based on larger statistical samples. Our simulated data agrees 
very well with the various observed data sets. The distribution 
function indicates that there is less H i surface area with a col- 
umn density of A^hi ~ 10^^ cm"^ than at higher column densi- 
ties of a few times 10^^ cm"^. This is indeed the case, which 
can be seen if the relative occurrence of diff'erent column den- 
sities is plotted. In Fig. [6] the fractional area is plotted (dashed 
line) as function of column density on logarithmic scale, which 
is given by: 

A(Nhi) 



fA = 



dlogiNm)' 



(12) 



The surface area first increases from the highest column den- 
sities (which are poorly resolved in any case above 10^^ cm"^) 
down to a column density of a few times 10^^ cm"^, but then 
remains relatively constant (per logarithmic bin). Only below 
column densities of a few times 10^^ cm"^ does the surface area 
per bin start to increase again, indicating that the probability 
of detecting emission with a column density near A^hi ~ 10^^ 
cm"^ is significantly larger compared to detecting emission 
with a column density of A^hi ~ 10^^ cm"^. Also of interest 
are plots of the cumulative H i mass and surface area. 

The solid line in Fig. [6] shows the total surface area sub- 
tended by H I exceeding the indicated column density. The plot 
is normalised to unity at a column density of A^hi = 10^^ cm"^. 
At high column densities the cumulative fractional area in- 
creases only moderately. Below a column density of A^hi ~ 
10^^ cm"^ there is a clear bend and the function starts to in- 
crease more rapidly. At column densities of A^hi ~ 10^^ cm"^, 
the area subtended by H i emission is much larger than at a limit 
of Nui ~ 10^^-^ cm"^, which corresponds to the sensitivity limit 
of most current observations of nearby galaxies. 

4.3. Hi column density 

In Fig. [7] column density maps are shown of the total and the 
neutral hydrogen distribution. The maps are integrated over the 
full 32 Mpc depth of the cube, with the colorbar showing 
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Fig. 5. Left panel: Hi distribution function after gridding to 80 kpc (dashed (red) line). The solid (blue) line corre sponds to data 
gridded to a 2 kpc cell size. Filled dots correspond to the QSO absorption line data ( Corbelli & B andieral l2002h . Right panel: 
Combined H i distribution functions of the simulation, gridded to a resolution of 2 kpc (solid (blue) line) and 8 kpc (dashed 
(purple) line). Overlaid are distribution function from observatio nal dat a of M31 (Bra un & Thi lker 2004), WHISP (ISwaters et al.l 
(l2002|) . IZwaan et al.l (|2005|)) and QSO absorption lines (Corb elli & Bandierall2002h respectively. The reconstructed H i distribu- 
tion function corresponds very well to all observed distribution functions 
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Fig. 6. Fractional area of reconstructed H i (dashed line, right- 
hand axis) and cumulated surface area (solid line, left-hand 
axis) plotted against column density on a logarithmic scale with 
a bin size of dlog(NHi) = 0.2. The probability of detecting 
emission with a column density near A^hi 
nificantly larger than around A^hi 

lated surface area is normalized to that at a column density of 
Nhi = 10^^ cm"^. At column densities of A^hi 



gen map reaches maximum values of A^h ~ 10^^ cm"^, while 
the connecting filaments have column densities of approxi- 
mately an order of magnitude less. In the intergalactic medium, 
the column densities are still quite high, A^h ~ 10^^ cm"^, yield- 
ing a very large mass fraction when the large surface area of the 
intergalactic medium is taken into account. 

In the column density map of neutral hydrogen it can 
be seen that it is primarily the peaks which remain. At the 
locations of the peaks of the total hydrogen map, we can see 
peaks in the Hi map with comparable column densities, that 
correspond to the massive galaxies and groups. The filaments 
connecting the galaxies can still be recognized, but with 
neutral column densities of the order of A^hi ~ 10^^ cm"^. 
Here the gas is still relatively dense, but not dominated by 
self- shielding, resulting in a lower neutral fraction. In the in- 
tergalactic regime, the neutral fraction drops dramatically. The 
gas is highly ionized with neutral columns of only A^hi ~ 10^^ 
cm"^, yielding only a very small neutral mass contribution. 



10^^ cm ^ is sig- 
j^Qi9.5 jj^g cumu- 



Figure [8] shows similar maps chosen from several high- 
resolution regions, gridded to 2 kpc instead of 80 kpc. The left 
panels show a column density map of all the gas, while in the 
middle panels the H i column densities are plotted. The right 
panels show the H i column density distribution function of the 
individual examples. The most complete distribution function 
is obtained by summing the distribution functions of all the in- 
dividual objects, but even the individual distribution functions 
already display the general trend of a flattening plateau around 
A^Hi ~ 10^^ cm"^. Some objects have just a bright core with ex- 
logarithmic column density in units of cm"^. The total hydro- tended emission, like the second example from the top. There 



area subtended by H i emission is much larger than at a limit of 



A^Hi ~ 10 cm , which is the sensitivity limit of most current 
observations of nearby galaxies. 
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Fig. 7. Top left panel: Column density of total H gas integrated over a depth of 32 Mpc on a logarithmic scale, gridded to a 
resolution of 80 kpc. Top right panel: Molecular hydrogen component. Only very dense regions in the total hydrogen component 
contain molecular hydrogen. Bottom panel: Neutral atomic hydrogen component of the same region. In the neutral hydrogen 
distribution the highest densities are comparable to the densities in the total hydrogen distribution, but there is a very sharp 
transition to low neutral column densities as the gas becomes optically thin. Note the very different scales, the total hydrogen 
spanning only 2 orders of magnitude and the neutral hydrogen, eight. 



are many objects with small diffuse companions with maxi- 
mum peak column densities of Nui ~ 10^^ cm"^. These com- 
panions are typically 20-40 kpc in size and are connected with 
filaments that have column densities of Nui ~ 10^^ cm"^ or 
even less. Comparing the plots containing all the hydrogen and 
just the neutral hydrogen it can be seen that the edge between 
low and high densities is much sharper for the neutral hydro- 
gen. The surface covered by column densities of A^hi ~ 10^^ 
cm"^ is much larger than the surface covered by column densi- 
ties of A^Hi ~ 10^^ cm-l 



4.3.1. Neutral fraction 

The neutral fraction is plotted in a particularly instructive way 
in Fig. [21 Neutral fraction of the hydrogen gas is plotted against 
H I column density, where the colorbar represents the relative 
likelihood on a logarithmic scale of detecting a given combi- 
nation of neutral column and neutral fraction. The most com- 
monly occuring conditions are a neutral column density around 
A^Hi -10^"^ cm"^ with a neutral fraction of ~ 10"^, representing 
Lya forest gas. The cutoff at low column densities is artificial, 
owing to our gridding scheme. 

At high densities, Nm > 10^^ cm"^, the gas is almost fully 
neutral and just below A^hi ~ 10^^ cm"^, the neutral frac- 
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Fig. 8. Four examples of high density regions in the reconstructed data, gridded to a cell size of 2 kpc. The left panels show the 
total hydrogen, while the middle panels show only the neutral component. Some objects have many satellites, as in the top panels, 
while others are much more isolated. All examples have extended Hi at column densities around \og(NHi) = 16 - 17 cm"^. In 
the right panel, the individual H i column density distributio n function is shown for each of the examples. Black dots correspond 
to the QSO absorption line data dCorbelli & BandieralEooi) . 
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Fig. 9. Neutral fraction plotted against H i column density. The 
colorbar represents the probability of detecting a certain com- 
bination of H I column density and neutral fraction on logarith- 
mic scale. At the highest densities A^hi > 10^^ cm"^, the neutral 
fraction is almost unity. Column densities around A^hi ~ 10^^ 
cm"^ have the smallest detection probability. 



tion starts to drop very steeply below the 10 percent level. 
This is exactly the column density that is considered to be the 
"edge" of H i galaxies, that defines the border between opti- 
cally thick and thin gas. This transition from high to low neu- 
tral density happens on very small scales of just a few kpc 
dPove & Shulll (Il994 t)). The surface area with column densi- 
ties in the range from A^hi ~ 10^^ to 10^^ cm"^ is relatively 
small. At lower column densities, the probability of detecting 
Hi in any given direction increases. The well-defined correla- 
tion of neutral fraction with neutral column for A^hi > 10^^ 
cm"^ defines a straightforward correction for total gas mass ac- 
companying an observed neutral column density. 

In Fig.[TOlthe cumulative mass is plotted as function of total 
hydrogen column density (left panel) and the column density of 
the Hi gas (right panel). Note that the vertical scale is diff'erent 
in the two panels. The plot is divided in diff'erent regions, the 
galaxies or Damped Lyman-a Absorbers (DLA) are coloured 
light gray. In neutral hydrogen, these are the column densities 
above log(A^Hi) = 20.3. Lower column densities belong to the 
Super Lyman Limit systems (SLLS), or sub-DLAs. In the plot 
showing the neutral hydrogen an inflection point can be seen 
at a column density of log(A^Hi) = 19. This is where the ef- 
fect of self shielding starts to decrease rapidly and the Lyman 
Limit regime begins. At the lower end, below column densities 
of log(A^Hi) = 16 is the Lyman alpha forest, which is coloured 
dark gray. As can be seen there is a huge difl'erence in mass 
contribution for the difl'erent phases, when comparing the neu- 
tral gas against the total gas budget. In H i, about 99 percent of 
the mass is in DLAs, Lyman Limit Systems account for about 
1 percent of the mass and the Lyman alpha forest contributes 
much less than a percent. When looking at the total gas mass 
budget all three components (DLAs, LLSs and the Ly-a forest) 
have approximately the same mass fraction. 
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Fig. 11. Two-point correlation function of Hi-rich objects in 
our simul ation, contrasted wit h a power-law fit to the observed 
relation of lMever etaP (l2007k . 



4.4. Two-Point Correlation Function 

The two-point correlation function measures the degree of clus- 
tering of galaxies in the spatial direction (^(r)), which relates 
dire ctly to the power spectrum through a Fourier transform 
(e.g. lGroth&Pe ebles 1977; Davis & Huchra 1982). The spa- 
tial two-point correlation function is defined as the excess prob- 
ability, compared with that expected for a random distribution, 
of finding a pair of galaxies at a separation ri 2. For Hi, the 



cluste ring is weaker compared to optical galaxies (iMeyer et al 
l2007h . On scales between ~ 0.5 kpc and 12 Mpc, the correla- 



tion function for op tical galaxies has been determined in SDSS 
dZehavi et al.ll2005h and 2dFGS dNorberg et al.ll2002h . For the 
H i-rich galaxies in the HIPASS catalogue, a scale length is ob- 
tained of ro = 3.45±0.25/z"^Mpcandaslopeofr = 1.47+0.08. 

In the past, several estimators have been given for the two- 
point correlation func tion, we will use the Landy & Szalay esti- 
mator as de scribed inlLandv & Szalay (1993) as this estimator 
is used in (iMeyer et al.ll2007h to determine the correlation for 
H I galaxies. This estimator is given by: 



^LS = ^[DD-2DR^RR] 
KK 



(13) 



where DD are the galaxy-galaxy pairs, RR the random-random 
pairs and DR the galaxy-random pairs. This estimator has to 
be normalised with the number of correlations in the simulated 
and random distributions: 



1 

^LS - — 



DD 



2DR 



{ndijid - l))/2 nrfid 



+ RR 



(14) 



where rid is the number of detections or simulated objects and 
Hr is the number of galaxies in the random sample. 

In Fig. [TT] the two-point correlation function is plotted; 
the black dots represent the values obtained from the simu- 
lation, while the dashed red line corresponds to the correla- 
tion function that i s fit to galaxies in the HIPASS catalogue by 
(IMeyer et al.ll2007h . The solid line is our best fit, with a scale 
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Fig. 10. Left panel: Cumulative mass of total hydrogen as function of column density, the different phases are shown with different 
colours. The Lya forest, the Lyman Limit System and galaxies have approximately the same mass component when considering 
all the gas. Right panel: Cumulative mass of neutral atomic hydrogen as function of column density. Although the surface covered 
by the LLSs is large, it contains only 1% off all the neutral gas mass, while about 99% resides in the galaxies 



length of ro = 3.3 ± 02h-^ Mpc and a slope of y = 1.7 ± 0.2, 
only data points where the radius is smaller than 6 Mpc have 
been used for the fit. 

There is very good correspondence between the simulated 
and observed H i-correlation functions on scales between -0.5 
Mpc and ~ 5 Mpc. Accuracy at smaller scales is limited by 
the finite resolution of the simulation. On the other hand, the 
representation of large scales is limited by the physical size of 
the box. In a 32 /z"^ Mpc box, the largest well-sampled struc- 
tures are ab out 5 Mpc in size. This difference is not surpris- 
ing, because Mever et al. (2001) are able to sample structures 
up to 10 Mpc, give n their significantly larger survey volume. 
Meyer et al.l (|2007|) also looked at a limited sample of galax- 



ies, applying the parameter cuts Mhi > 10^ Mq and D < 30 
Mpc. This limited sample is very similar to our sample of sim- 
ulated objects and the power law parameters in this case are 
ro = 3.2 ± lAh~^ Mpc and y = 1.5 ± 1.1. Although the errors 
are larger, the results are very similar to the full sample and in 
excellent agreement with our simulations. 

4.5. Hi Mass Function 

The H I Mass Function O(Miii) is defined as the space density 
of objects in units of M pc~^ . For fitting purposes a Schechter 
function (ISchechtejll976l) can be used of the form: 



^(MhiMMhi = ^(— ) exp(-— )^ (15) 



characterised by the parameters a, M^j 



and which define 
the slope of the power law, the H i mass corresponding to the 
"knee" and the normalisation respectively. In a logarithmic 
form the H i Mass function can be written as: 



6/(Mhi) = ln(10)6/* 



a+l 



exp 



-Mhi \ 

iKZ JTT / 



(16) 



The reconstructed structures in our high resolution grids 
can be used to determine a simulated Hi Mass Function for 
structures above -5x10^ M©. The result is plotted in the left 
panel of FigIT2l where the H i Mass function is shown with 
a bin size of 0.1 dex. Overlaid is the best fit to the data, the 
fitting parameters that have been used are 6* = 0.059+0.047, 
log(M^j) = 9.2 ± 0.3 and a = -1.16 ± 0.45. Note that in this 
case the value of a is not very well-constrained, as this parame- 
ter defines the slope of the lower end of the H i Mass Function, 
but our simulation is unable to sample the mass function below 
a mass of Mhi ~ 5 xlO^M©. The result is compared with the H i 
mass function from Zwaan et al. 112003) (dash-dotted line). The 
reconstructed mass function corresponds reasonably well with 
the mass functions obtained from galaxies in HIPASS around 
M*. At masses around 10^^ M©, the error bars are very large 
due to small number statistics. A much larger simulation vol- 
ume is required to sample this regime properly. There is a hint 
of an excess in the simulation near our resolution limit, this 
may simply reflect cosmic varia nce and will be addressed in fu- 
ture studies. Izwaan et all (12003 ) compared four different quad- 
rants of the southern sky and found that at Mhi ~ lO^M© the 
estimated space density varies by a factor of about three, which 
is comparable to the factor ~ 2 difference we see between the 
simulation and observations. 



4.5.1. H2 Mass Function 

The H2 mass Function can be determined in a similar way 
as the Hi Mass Function. The result is shown in the right 
panel of Fig. [12] where the simulated data points are fitted 
with a Schechter function. Our best fit parameters are 0* = 
0.036+0.036, log(M^p = 8.7 + 0.3 and a = -1.01 + 0.57. 
At the high end of the mass function the results are affected by 



low number statistic s. The simulated fit is compared with the 
fits as determined by lObreschkow & Rawlings ((2009h (dashed 
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Fig. 12. Left panel: Hi Mass Function of the simu lated data (black dots) with the best-fit Schechter function (solid black line), 
compared with the HIMF from lZwaan et alJ ( 2003 ) (dash-dotted red line). Our best fit line is dashed below lO^M©, because there 
are no data points there and the function is only an extra polation . Right panel: H2 Mass Function of the simulated data (black 
dots) a nd the best fit, compared with the H2MF from Ker es et aD (2003) (dash-dotted (blue) line) and Obreschkow & Rawlingi 
(l2009h (dashed (red) line). Both simulated Mass Functions show agreement with observations over about a decade in mass. 




Fig. 13. Derived H2 masses are plotted against H i masses for 
individual simulated objects. The distribution can be fit using a 
simple power law (solid (red) line), although the scatter is very 
large. The dashed vertical line represents the sample cut-off' for 
H I masses in the simulated data. 



line) and iKeres et al. I ('2003') (dash-dotted line). There is very 
good agreement over the full mass range. 

In Fig. [13] the derived H2 masses are plotted as function of 
Hi mass. The dashed vertical line represents the completeness 
limit of H I masses. The data can be fitted using a power law 
(solid line) which looks like = {Mnilmof with a scaling 
parameter of mo = 158 ± 43 and a slope ofj3= 1.2 + 0.1. 



4.6. Stars, Dark Matter and Molecular Hydrogen 

In addition to the SPH-particles, the simulations also contain 
dark matter and stars. The distribution of these components can 
be reconstructed and compared with the distribution of neutral 
hydrogen. For reconstructing the dark matter and stars a very 
simple adaptive gridding scheme has been used. This gridding 
scheme was adopted, because the dark matter and star parti- 
cles do not have a variable smoothing kernel like the gas parti- 
cles. They do have a smoothing kernel defined by the softening 
length of 2.5 kpc however this is a fixed value. The grid- 
ding method as described in section 3.3 using a spline kernel 
can not be used for this reason. Exactly the same regions have 
been gridded as those previously determined Hi objects. First 
the objects have been gridded with a cell size of 10 kpc using a 
nearest grid-point algorithm. The resulting moment maps have 
been used to determine which high density regions contain 
many particles. In a second step the particles have been gridded 
to five independent cubes using a 2 kpc cell size. Based on the 
density of simulation particles in the 10 kpc resolution cube, an 
individual particle is assigned to one of five diff'erent cubes for 
gridding. The density threshold is determined by the number of 
particles in the 10 kpc resolution cube integrated along the line 
of sight. The threshold numbers are 2, 6, 18, 54 and everything 
above 54 particles respectively. The five cubes are integrated 
along the line of sight and smoothed using a Gaussian kernel 
with a standard deviation of 7, 5, 3.5, 2.5 and 1.5 pixels of 
2 kpc respectively. Finally the five smoothed maps are added 
together. These smoothing kernels where chosen to insure that 
each individual cube covering a diff'erent density regime has a 
smooth density distribution, but preserves as much resolution 
as practical. 

In Figure [14] four examples are shown of the dark matter 
and stellar distributions overlaid on the of Hi column den- 
sity maps. The right panels in this image show the contours 
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of molecular hydrogen. The stars are concentrated in the bright 
and dense parts of the H i objects corresponding to the bulges 
and disks of galaxies. The third example shows an edge-on ex- 
tended gas disk (its thickness is likely an artifact of our nu- 
merical resolution). The neutral hydrogen is much more ex- 
tended than the stars, and the smaller, more diffuse H i clouds 
do not have a stellar counterpart in general. Many objects have 
H I satellites or companions, as in the first two examples. These 
companions or smaller components do not always have a stellar 
or molecular counterpart, although the H i column densities can 
reach high values up to Nm ~ 10^^ cm"^ as seen in Galactic 
high velocity clouds. 

Interestingly, these H i clouds only occasionally trace dark 
matter substructures, hence in many cases they are not obvi- 
ous large-scale accretion events (since the most massive ac- 
creting clumps should be accompanied by dark matter). The 
origin of these diff'use clouds, perhaps analogous to high ve- 
locity clouds, may be from a "halo fountain" of gas cycling in 
and out of galaxies ow in^ to galactic outflows, as speculated in 
Qppenheimer & Dava (|2QQ8|) . or represent more difl'use accre- 
tion from the extended environment. Studying the kinematics 
and metallicities of these clouds may reveal signatures of their 
origin. In the future we plan to investigate such signatures in 
the simulations to assess how observations of difl'use H i clouds 
around galaxies can inform our understanding of the processes 
of galaxy assembly. 

5. Discussion 

To make observational predictions based on numerical simu- 
lations, the first essential step is to establish that the simu- 
lation can adequately reproduce all observational constraints. 
We have carried out a critical comparison of our simulated H i 
data with observations using a wide range of statistical mea- 
sures. Essential in creating simulated data is the minimization 
of the number of free parameters over and a bove the many that 
are already inherent in the simulation; ( Opp enheimer & Dave 
20061 120081) . In our analysis, the only additional assumptions 
we make are that the transitions from ionized to atomic and 
from atomic to molecular hydrogen can be described by a sim- 
ple threshold eff'ect at two diff'erent values of the thermal pres- 
sure, while demanding that the recombination time be less than 
the sound-crossing time. The threshold values are determined 
by fixing the average H i density and the density ratio VIh2 I^hi 
at z = to those determined observationally. In choosing this 
simple prescription we are strongly limited by current numer- 
ical capabilities. Although we did not solve the complete ra- 
diative transfer problem, we do get quite complex behaviour 
emerging. The range of threshold values we explored are con- 
sistent with expected values in literature. 

The statistics of the reconstructed and observed H i distri- 
butions agree quite well, making it plausible that the associated 
H I structures in the simulation may be similar to those ocurring 
in nature. The simulation cannot reproduce structures that re- 
semble actual galaxies in detail. Besides the finite mass resolu- 
tion of the SPH-particles of ~ 10^ M©, there are the inevitable 
limitations on the included physical processes and their practi- 
cal implementation. Nonetheless, we may begin to explore the 



fate of partially neutral gas in at least the diff'use outskirts of 
major galaxies. 

Despite the limitations, the simulations can reproduce 
many observed statistical aspects of H i in galaxies, which is 
very encouraging for further exploration of this approach. The 
adopted self- shielding threshold provides good results for this 
one simulation. Future work will test the variations that are 
encountered with diff'erent feedback mechanisms. The method 
can also be applied to look at the evolution of neutral hydrogen 
with redshift. Furthermore, mock observations can be created 
to make predictions for what future telescopes should detect. 
This is particularly relevant for assessing performance require- 
ments for the various facilities now under development with a 
strong Hi focus, such as the Square Kilometre Array (SKA) 
and many of the SKA pathfinders. 

5.1. Resolution Effects 

Simulations are limited by their volume and the mass resolu- 
tion of the particles (over and above the limitations that re- 
sult from incomplete physics). Although we are able to recon- 
struct structures of several Mpc in scale, the simulated volume 
is relatively small. To be able to reconstruct the largest struc- 
tures encountered in the universe, and eff'ectively overcome 
cosmic variance, a simulated volume is needed that is about 
about 300 (rather than 32) Mpc on a side. In the current vol- 
ume, the most massive structures are suff'ering from low num- 
ber statistics. A drawback of using a larger volume is that the 
mass resolution of individual particles decreases rapidly, mak- 
ing it impossible to resolve structures on even multi-kiloparsec 
scales. The approach we have employed to approximate the 
eff'ects of self- shielding is extremely simple. The actual pro- 
cesses acting on sub-kiloparsec scales are undoubtedly much 
more complex. Clumping will occur on the scales of molecu- 
lar clouds, which will dramatically increase the local densities. 
The threshold thermal-pressure we determine to approximate 
the atomic to molecular hydrogen transition only has possible 
relevance on the kiloparsec scales of our modelled gas parti- 
cles. At the smaller physical scales where the transition actu- 
ally occurs, the physical pressures will likely be substantially 
diff'erent. 

We note that realistic simulations of cosmological volumes 
are extremely challenging, and that even the current state-of- 
the-art is not particularly successful at reproducing objects that 
resemble observed galaxies in great detail. The simulation we 
employ represents a very good compromise between simula- 
tions focused on larger and smaller scales. We are mainly inter- 
ested in the diff'use intergalactic structures on multi-kiloparsec 
scales, that would be observable when doing 21cm Hi obser- 
vations with sufficient sensitivity. We have enough resolution 
to map and resolve these structures and to reconstruct the ex- 
tended environments of galaxies and galaxy groups. Our sim- 
ulation is not suitable for making predictions about the inner 
cores of galaxies or resolving the star formation process in 
molecular clouds. 

Future work will test our analysis method on both larger 
and smaller scales. Simulations in a larger volume can pro- 
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Fig. 14. Column density maps of four reconstructed objects as seen in neutral hydrogen with contours of Dark Matter (left panels), 
Stars (middle panels) and molecular hydrogen (right panels). For both the Dark Matter and the stars contour levels are at = 3, 
5, 10, 20, 30, 50 and 100 xlO^ M© kpc'^. For the molecular hydrogen contours are drawn at Nh^ = 10^^ 10^^ 10^^ and 10^^ 
cm"^. Stars are concentrated in the very dense parts of the Hi objects, dark matter is more extended, however the extended Hi 
does not always trace the dark matter. The H i satellites or companions are within the same Dark Matter Halo, but do not always 
contain stars. 



vide a more sensitive test of the reconstructed H i Mass function volume will probably require a more advanced method of mod- 
and the two-point correlation function. Simulations in a smaller elling the atomic to molecular hydrogen transition. However, 
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substantial insights into the more diffuse phenomena, such as 
accretion and feedback processes around individual galaxies, 
are very likely within reach. 

5.2. Future Observations 

This simulation can not only be used for getting a better un- 
derstanding of the Hi distribution, especially at low column 
densities, but is also very suitable for making predictions for 
future observations. Currently many radio telescopes are being 
built as pathfinders toward the Square Kilometre Array (SKA). 
A few examples are the Australia SKA Pathfinder (ASKAP), 
the Allen Telescope Array (ATA), Karoo Array Telescope 
(MeerKAT) and the Low frequency array (LOFAR), and many 
more. Although the SKA will be the final goal, each of these 
telescopes is a good detector on its own and is planned to be op- 
erational in the relatively near future. Of course each telescope 
will have diff'erent characteristics, but in general it will be pos- 
sible to do surveys deeper, faster and over a broader bandwidth. 

We will discuss the simulated maps of two current sin- 
gle dish telescopes, Parkes and Arecibo, and compare these 
with the capabilities of two future telescopes, the ASKAP 
and the SKA. Parkes and Arecibo are both single dish tele- 
scopes with a multi-beam receiver that have recently been 
used to d o large area surveys , the H i Parkes All Sky Survey 
(HIPASS, iBarnes et all (1200 1|)) and the A r ecibo Legacy Fast 
ALFA Survey f ALFALFA loiovanelli et aP (l2005b ). 

To make a fair comparison, all four telescope will get 
500 hours of observing time to map 30 square degrees of 
the sky. These numbers are chosen as 500 hours is a reason- 
able timescale to make a deep image and a 30 square degree 
field is needed to map the extended environment of a galaxy. 
Furthermore, 30 square degrees is the planned instantaneous 
field-of-view of ASKAP. 

We focus on two cases that illustrate the capabilities of 
present and upcoming telescopes. In the first example, observa- 
tions will be simulated at a distance where the beam of the tele- 
scopes has a physical size of 25 kpc. This approximate beam 
size is needed to resolve diff'use filaments and extended com- 
panions from the simulation. In the second example observa- 
tions will be simulated at a fixed distance of 6 Mpc, which is 
the limiting distance for the Parkes telescope according to the 
above argument. 

Right ascension and declination coordinates are added 
according to the distance, centered on an RA of 12 hours 
and a Dec of degrees. For each telescope the sensitivity 
is determined that can be achieved using the given con- 
ditions. The technical properties are listed in Table [T] For 
the Parkes telescope we use the sensitivity that is achieved 
when re-reducing HIPASS data (Popping 2009 in prep.), 
which corresponds to ~ 8mJy/Beam over 26 km s"^ for 
a typical HIPASS field after integrating over ~ 560s per 
square degree. For the Arecibo telescope we used the sen- 
sitivity of the Ar ecibo Galaxy Environment Survey (AGES) 



initial Array Configuration paper which can be found on 
http://www. atnf. csiro. au/projects/askap/newdocs/configs-3.pdf, 
ASKAP is expected to achieve a sensitivity of 7.3 mJy/beam 
over 21 km s"^ after one hour of integration time. For the 
SKA we assume a Ae/f/Tsys = 2000 m^ K"^ at 2.5 arcmin 
resolution, which is 20 times higher than ASKAP. Furthermore 
we assume a field of view similar to ASKAP, 30 square degrees 
instantaneous. 

All the flux densities can be converted into brightness tem- 
perature using the equation: 



2ka 



(17) 



where X is the observed wavelength, S is the flux density, k 
the Boltzmann constant and Q is the beam solid angle of the 
telescope. When using the 21 cm line of H i, this equation can 
be written as: 

606 ^ 



(18) 



where bmin and bmaj are respectively the beam minor and major 
axis in arcsec and S is the flux in units of mJy/Beam. The inte- 
grated 21cm line intensity can directly be converted into an H i 
column density using: 



( Auld et al.l 1200 6) assuming 0.95 mJy/beam over 10 km 



s ^ after integrating for 10 hours per square degree. The 
expected sensitivities for ASKAP are described in the 



Nhi = 1.823- 10^^ J" Tbdv (19) 
with [Nhi] = cm'^ [Tt] = K and [dv] = km s"^. 

The second column in Table [T] gives the beam size of each 
telescope, the third column gives the distance at which the 
beam has a physical size of 25 kpc, and the fourth column gives 
the physical beam size at a distance of 6 Mpc. The sensitivities 
in column five are converted to a column density limit when 
sampling a line of approximately 25 km s"^ width in the last 
column. We assume that in any analysis only the channels will 
be selected containing diff'use emission and that the line width 
of diff'use regions will be of the order of 25 km s"^ Galaxies 
can have a much larger linewidth, however detecting these is 
not an observational challenge. The reconstructed maps have 
an intrinsic resolution of 2 kpc with a minimum smoothing ker- 
nel size of three pixels. This yields an initial beam size of 6 
kpc, which will be smoothed to the appropriate beam size of 
each telescope, corresponding to the simulated distance. Using 
the calculated sensitivity limits random gaussian noise is gen- 
erated and added to the maps. The steps are shown in FigurefT?! 
the left panel shows the original reconstructed column density 
map that is smoothed to the beam of ASKAP in the middle 
panel. At this stage most of the diff'use and extended emission 
can still be recognized. Noise is added in the right panel, most 
of the diff'use emission disappears in the noise. 

In Figure[T6lthe same field is shown as in FigureO placed 
at a distance where the beam has a size of 25 kpc. This means 
that the object looks similar for all four telescopes as it fills the 
same number of beams, although there is a big diff'erence in 
angular scale. 

For each panel contour levels are drawn starting at 3 cr and 
increasing as noted in the figure caption, so the actual values 
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Telescope 


Beam (arcmin) 


D (beam 25 kpc) 


Beam at D = 6 Mpc 


RMS (25 km s'O 


Nhi 


(25 km s-^) 


Parkes 


14.4' 


6 Mpc 


25 kpc 


0.9 mJy/Beam 


3.3 


• 10^6 cm-2 


Arecibo 


3.5' 


25Mpc 


6.1 kpc 


0.45 mJy/Beam 


2.9 


• 10^^ cm-2 


ASKAP 


3' 


28 Mpc 


5.2 kpc 


0.3 mJy/Beam 


2.6 


• 10^^ cm-2 


SKA 


2.5' 


33 Mpc 


4.4 kpc 


0.015 mJy/Beam 


1.8 


• 10^6 cm-2 



Table 1. Sensitivity limits for four different telescopes for an assumed linewidth of 25 km s ^ after a total integration time of 500 
hours to image an area of 30 square degrees. 

1 — ' — - 



12^04°^ 12''02™ 12'>00'" 11*^58'" ll"56" 12''04'" 12'>02™ 12'>00'" 1 1''58™ ll"56™ 12''04" 12''02™ \2^Q(i^ 1 1'^SB"' ll''56™ 

Right Ascension (J2000) Right Ascension (J2000) Right Ascension (J2000) 

Fig. 15. Left panel: simulated object at a distance of 6 Mpc, with a 6 kpc intrinsic beam size. Middle panel: simulated object 
smoothed to a 8.7 kpc beam size, corresponding to the ASKAP beam at 6 Mpc. Right panel: noise is added corresponding to the 
ASKAP sensitivity limit after a 500 hour observation of 30 square degrees. 




are different in each panel, but can be determined using the 
sensitivity limits given in table. [T] All panels look very similar, 
as most of the structure and substructure of the original map 
is smoothed into two large blobs and essentially all the diffuse 
structures are lost in the noise. Only with the SKA can some ex- 
tended contours still be recognized at the top of the left object. 
However, it is very difficult to distinguish extended emission 
from companions, unless the companions are clearly separated 
by at least a beam width from the primary object. 

An example of this can be seen in Figure [T7] where the 
same object is shown simulated with four telescopes but now 
at a similar distance of 6 Mpc. A smaller beam size now 
yields higher resolution and more detected structure. The dif- 
ferences between the four panels are now obvious. Observed 
with Parkes in the top left panel the object has essentially no 
resolved structure. Clearly the beam size is too large and only 
suitable for very nearby galaxies, closer than 6 Mpc. Arecibo, 
with a much smaller beam, can resolve the inner core of the 
object and can just detect the brightest companion. Again the 
contour levels have values starting at 3 cr, so the outer contour 
in the top right panel corresponds to 9 • 10^^ cm"^. ASKAP 
has a very similar sensitivity limit as Arecibo with a compa- 
rable beam size. However it is notable that it reaches Arecibo 
sensitivities with a much smaller collecting area. Furthermore, 
these deep integrations have not really been explored, so the 
given sensitivities are theoretical limits. It is very likely that 
correcting for systematic effects, like the shape of the spectral 
bandpass, will be more achievable with an interferometer like 
ASKAP rather than a large single dish telescope. In the SKA 
image essentially all companions can be clearly distinguished 
down to a contour level of ~ 4 - 10^^ cm"^. Note that this value is 



lower than the 3(T value in Table[TJ this is because the beam size 
of the SKA is smaller than the beam size of the simulation at a 
distance of 6 Mpc. To adjust for this we adopt the beam size of 
the reconstructed data and smooth the noise to this larger beam 
size. In this case the noise value will decrease, resulting in a 
higher sensitivity of 1.3 • 10^^ cm"^. 

6. Conclusion 

We have used a hydrodynamic simulation to predict the neutral 
hydrogen distribution in the universe. The simulation employs 
a random cube of 32 Mpc on a side at redshift zero, with an 
SPH mass resolution of ~ 10^ M©. The physics in the simula- 
tion includes subgrid treatments of star formation and feedback 
mechanisms. 

We have developed a method to extract the neutral hydro- 
gen component from the total gas budget. At low volume den- 
sities the balance is calculated between photo-ionization and 
radiative recombination. For high densities a correction has to 
be applied for self- shielding, as the gas becomes optically thick 
for ionizing photons. In the densest regions, the atomic hydro- 
gen is turned into molocular hydrogen that will subsequently 
form stars. The molecular hydrogen and the self- shielding tran- 
sition are both modelled by a critical thermal pressure. Above 
the first threshold limit {PIk = 155 cm"^ K), the gas is as- 
sumed to recombine and the neutral fraction is set to unity. At 
even higher pressures (P/k = 810 cm"^ K), the atomic hydro- 
gen is assumed to become molecular hydrogen, so the atomic 
fraction becomes zero. These processes only apply to simu- 
lated particles for which the recombination time is shorter than 
the sound-crossing time on kpc scales. The two threshold pres- 
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Parkes 6 Mpc Arecibo 35 Mpc 
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Fig. 16. Simulated observations of the same object at a distance at which the beam size corresponds to 25 kpc for Parkes (top 
left), Arecibo (top right), ASKAP (bottom left) and the SKA (bottom right). All contours begin at a 3 cr level, after a 500 hour 
observation of a 30 square degree field with each telescope. Every subsequent contour level is a factor 3 higher than the previous 
one. Note that the angular scale is diff'erent in each panel. 



sures are tuned to reproduce the observed average H i density of 
Phi = 6.1 X 10^/z M© Mpc"^ and an assumed molecular density 
of = 1.8 X 10^ /z Mq Mpc"^, corresponding to a molecular 
to atomic density ratio of t/^^o = 0.3. 

A wide range of statistical comparisons have been made 
between the reconstructed H i distribution and existing observa- 
tional constraints including: the two-point correlation function, 
the H I mass function and the H i column density distribution. 
There is agreement between all these statistical measures of the 
observations and the simulations, which is a very encouraging 
result. Based on this agreement, the simulated Hi distribution 
may be a plausible description of the H i universe, at least on 
the intermediate spatial scales that are both well-resolved and 
well-sampled. 

We also compare the distribution of neutral and molecu- 
lar hydrogen with the distribution of dark matter and stars in 



the simulation. Massive H i structures generally have associated 
stars, but the more diff'use clouds do not contain large stellar 
components or in many cases even concentrations of dark mat- 
ter. The method to extract neutral hydrogen from an SPH output 
cube can be applied to other simulations, to allow comparison 
of diff'erent models of galaxy formation. Furthermore, the re- 
sults can be used to create mock observations and make pre- 
dictions for future observations. This preliminary study shows 
that as H i observations of diff'use gas outside of galactic disks 
continue to improve, simulations will play a vital role in guid- 
ing and interpreting such data to help us better understand the 
role that H i plays in galaxy formation. 
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Fig. 17. Simulated observations of the same object at a fixed distance of 6 Mpc for Parkes (top left), Arecibo (top right), ASKAP 
(bottom left) and the SKA (bottom right). Contour levels start at a 3 cr level after 500 hour observation of a 30 square degree 
field. For Parkes every subsequent contour is a factor 3 higher than the previous one. For ASKAP and Arecibo, the contours 
interval is a factor 7. For the SKA the contours start a 4 • 10^^ cm"^ and increase by a factor 6. Parkes is not really competitive 
in detecting substructures at this distance. The other telescopes all have sufficient resolution, however only the SKA is sensitive 
enough to detect the faint diff'use sub- structures. 
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